Search results for "Computer cluster"

showing 9 items of 9 documents

parSRA: A framework for the parallel execution of short read aligners on compute clusters

2018

The growth of next-generation sequencing datasets poses a challenge to the alignment of reads to reference genomes in terms of both accuracy and speed. In this work we present parSRA, a parallel framework to accelerate the execution of existing short read aligners on distributed-memory systems. parSRA can be used to parallelize a variety of short read alignment tools installed in the system without any modification to their source code. We show that our framework provides good scalability on a compute cluster for accelerating the popular BWA-MEM and Bowtie2 aligners. On average, it is able to accelerate sequence alignments on 16 64-core nodes (in total, 1024 cores) with speedup of 10.48 …

0301 basic medicine; Source code; Speedup; General Computer Science; Computer science; media_common.quotation_subject; Parallel computing; Supercomputer; Theoretical Computer Science; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; 030220 oncology & carcinogenesis; Modeling and Simulation; Computer cluster; Scalability; Fuse (electrical); Node (circuits); Partitioned global address space; media_common

Journal of Computational Science

researchProduct

Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters

2016

Computing alignments between two or more sequences is a common operation in computational molecular biology. The continuing growth of biological sequence databases establishes the need for their efficient parallel implementation on modern accelerators. This paper presents new approaches to high performance biological sequence database scanning with the Smith-Waterman algorithm and the first stage of progressive multiple sequence alignment based on the ClustalW heuristic on a Xeon Phi-based compute cluster. Our approach uses a three-level parallelization scheme to take full advantage of the compute power available on this type of architecture; i.e. cluster-level data par…

0301 basic medicine; Xeon Phi clusters; Computer science; Data parallelism; Parallel algorithm; 02 engineering and technology; Dynamic programming; Biochemistry; Pairwise sequence alignment; Computational science; 03 medical and health sciences; Structural Biology; Computer cluster; 0202 electrical engineering electronic engineering information engineering; Amino Acid Sequence; Databases Protein; Molecular Biology; 020203 distributed computing; Research; Applied Mathematics; Computational Biology; Proteins; Smith-Waterman; Computer Science Applications; 030104 developmental biology; Multiple sequence alignment; Databases Nucleic Acid; Sequence Alignment; Algorithms; Software; Xeon Phi

BMC Bioinformatics

researchProduct

Analyzing the performance of a cluster-based architecture for immersive visualization systems

2008

Cluster computing has become an essential issue for designing immersive visualization systems. This paradigm employs scalable clusters of commodity computers with much lower costs than would be possible with the high-end, shared memory computers that have been traditionally used for virtual reality purposes. This change in the design of virtual reality systems has caused some development environments oriented toward shared memory computing to require modifications to their internal architectures in order to support cluster computing. This is the case of VR Juggler, which is considered one of the most important virtual reality application development frameworks based on open source code. Thi…

Computer Networks and Communications; business.industry; Computer science; Virtual reality; Modular design; computer.software_genre; Theoretical Computer Science; Visualization; Shared memory; Artificial Intelligence; Hardware and Architecture; Human–computer interaction; Computer cluster; Scalability; Cluster (physics); Operating system; Architecture; business; computer; Software

Journal of Parallel and Distributed Computing

researchProduct

Analyzing big datasets of genomic sequences: fast and scalable collection of k-mer statistics

2019

Background: Distributed approaches based on the MapReduce programming paradigm have started to be proposed in the Bioinformatics domain, due to the large amount of data produced by the next-generation sequencing techniques. However, the use of MapReduce and related Big Data technologies and frameworks (e.g., Apache Hadoop and Spark) does not necessarily produce satisfactory results, in terms of both efficiency and effectiveness. We discuss how the development of distributed and Big Data management technologies has affected the analysis of large datasets of biological sequences. Moreover, we show how the choice of different parameter configurations and the careful engineering of the …

Data Analysis; FOS: Computer and information sciences; Time Factors; Computer science; Statistics as Topic; Big data; Apache Spark; Distributed computing; Performance evaluation; k-mer counting; lcsh:Computer applications to medicine. Medical informatics; Biochemistry; Domain (software engineering); 03 medical and health sciences; 0302 clinical medicine; Structural Biology; Computer cluster; Statistics; Spark (mathematics); Molecular Biology; lcsh:QH301-705.5; 030304 developmental biology; 0303 health sciences; Genome; Settore INF/01 - Informatica; Base Sequence; Databases Nucleic Acid; business.industry; Research; Algorithms; Software; Applied Mathematics; Computer Science Applications; Algorithm; Computer Science - Distributed Parallel and Cluster Computing; lcsh:Biology (General); 030220 oncology & carcinogenesis; Scalability; lcsh:R858-859.7; Algorithm design; Distributed Parallel and Cluster Computing (cs.DC); business

researchProduct

LPCC

2019

Most high-performance computing (HPC) clusters use a global parallel file system to enable high data throughput. The parallel file system is typically centralized and its storage media are physically separated from the compute cluster. Compute nodes, as clients of the parallel file system, are often additionally equipped with SSDs. These node-internal storage media are rarely well integrated into the I/O and compute workflows. How to make full and flexible use of these storage media is therefore a valuable research question. In this paper, we propose a hierarchical Persistent Client Caching (LPCC) mechanism for the Lustre file system. LPCC provides two modes: RW-PCC builds a read-write cache on…

File system; Computer science; Computer cluster; Hierarchical storage management; 0202 electrical engineering electronic engineering information engineering; Operating system; 020206 networking & telecommunications; 020207 software engineering; Lustre (file system); 02 engineering and technology; Cache; computer.software_genre; computer

Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis

researchProduct

File system scalability with highly decentralized metadata on independent storage devices

2016

This paper discusses using hard drives that integrate a key-value interface and network access in the actual drive hardware (Kinetic storage platform) to supply file system functionality in a large-scale environment. Taking advantage of higher-level functionality to handle metadata on the drives themselves, a serverless system architecture is proposed. Skipping path component traversal during the lookup operation is the key technique discussed in this paper to avoid performance degradation with highly decentralized metadata. Scalability implications are reviewed based on a FUSE file system implementation.

Peer Reviewed

Information storage and retrieval systems; Computer science; Distributed computing; Interface (computing); Key-value storages; 02 engineering and technology; computer.software_genre; Object storages; Lookups; Computer cluster; Server; Data_FILES; 0202 electrical engineering electronic engineering information engineering; :Informàtica::Arquitectura de computadors [Àrees temàtiques de la UPC]; Virtual storage; File system; Metadata; File organization; Scalability; 020206 networking & telecommunications; 020207 software engineering; File systems; Object storage; Kinetics; Informació -- Sistemes d'emmagatzematge i recuperació; Grid computing; Systems architecture; computer; Cluster computing

researchProduct

Heterogeneous PBLAS: Optimization of PBLAS for Heterogeneous Computational Clusters

2008

This paper presents a package, called Heterogeneous PBLAS (HeteroPBLAS), which is built on top of PBLAS and provides optimized parallel basic linear algebra subprograms for heterogeneous computational clusters. We present the user interface and the software hierarchy of the first research implementation of HeteroPBLAS. This is the first step towards the development of a parallel linear algebra package for heterogeneous computational clusters. We demonstrate the efficiency of the HeteroPBLAS programs on a homogeneous computing cluster and a heterogeneous computing cluster.

Kernel (linear algebra); ScaLAPACK; Computer science; Computer cluster; Linear algebra; Cluster (physics); Concurrent computing; Symmetric multiprocessor system; Parallel computing; Basic Linear Algebra Subprograms; Computational science

2008 International Symposium on Parallel and Distributed Computing

researchProduct

XLCS: A New Bit-Parallel Longest Common Subsequence Algorithm on Xeon Phi Clusters

2019

Finding the longest common subsequence (LCS) of two strings is a classical problem in bioinformatics. A basic approach to solve this problem is based on dynamic programming. As the biological sequence databases are growing continuously, bit-parallel sequence comparison algorithms are becoming increasingly important. In this paper, we present XLCS, a new parallel implementation to accelerate the LCS algorithm on Xeon Phi clusters by performing bit-wise operations. We have designed an asynchronous IO framework to improve the data transfer efficiency. To make full use of the computing resources of Xeon Phi clusters, we use three levels of parallelism: node-level, thread-level and vector-level.…

Longest common subsequence problem; Dynamic programming; Speedup; Computer science; Computer cluster; Asynchronous I/O; Cache; Supercomputer; Algorithm; Xeon Phi

2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS)

researchProduct

Tuning of QoS Aware Load Balancing Algorithm (QoS–LB) for Highly Loaded Server Clusters

2001

This paper introduces a novel algorithm for content-based switching. It presents a content-based scheduling algorithm (QoS Aware Load Balancing Algorithm, QoS-LB) that can be used at the front end of a server cluster. The front-end switch uses the content information of the requests and the load on the back-end servers to choose the server to handle each request. At the same time, the different Quality of Service (QoS) classes of the customers can be considered as one parameter in the load balancing algorithm. This novel feature becomes more important when service providers begin to offer the same services to customers with different priorities.

Network Load Balancing Services; Computer science; Quality of service; Computer cluster; Server; Round-robin DNS; Mobile QoS; Load balancing (computing); Service provider; Active queue management; Algorithm; Scheduling (computing)

researchProduct